This map has multipe layers: one for potential customers (colorful) and others for areas we already have covered (dark, hiding the colorful layer below). By stacking these layers, we can see expansion opportunities: areas with potential customers where we haven't sold yet.
To toggle the layers, use the layers button in the upper left corner.
With Sales data on top (in black) hiding the potential customer layer below (colorful), what remains are expansion opportunities: areas of high potential customer density where we have yet to sell many products.
Clicking on the map shows the name of the county, and clicking the 'x' closes this pop-up.
mymap #show map
# Population Data:
# Energy Access Explorer
# https://www.energyaccessexplorer.org/tool/s/
# --> Tool
# --> Country: Kenya
# --> Geography Kenya
# --> Expansion of Clean Energy Markets
# --> click menu (top right)
# * remove mini-grids
# * remove lower bound on distance to distribution lines
# * remove lower bound on poverty
#
# multi-criteria analysis described here:
# https://files.wri.org/d8/s3fs-public/energy-access-explorer-data-and-methods.pdf
# Sun King Sales Data
country = 'Kenya' # Capitalized
sales_interval = '1 year' # Redshift SQL-compatible interval
sql1 = f"""
SELECT
accounts.angaza_id,
accounts.area_derived as area,
accounts.region,
accounts.country,
COALESCE(
NULLIF(accounts.unit_location_latitudelongitude,''),
NULLIF(accounts.registration_location_latitudelongitude,'')
) AS latlong, -- get coordinates from whatever column has it
split_part(latlong,',', 1)::DECIMAL(6,4) AS "unit_lat",
split_part(latlong,',', 2)::DECIMAL(7,4) AS "unit_long"
FROM easybuy.accounts_final_pdt AS accounts
WHERE split_part(latlong,',',2) <> '' -- check for valid gps
AND accounts.registration_date_utc::DATETIME > GETDATE() - INTERVAL '{sales_interval}'
AND accounts.country = '{country}'
-- AND product_group LIKE 'SHS%'
-- AND region = 'Nyanza' -- Kenya
-- AND area in ('Akoko') -- Nigeria
LIMIT 1e7
"""
# sales_minimum_per_agent = 50 # show agents with at least this many sales
# sql2 = f"""
# SELECT base.angaza_id as agent,
# base.country,
# base.latitude ::DECIMAL(6,4) as base_lat,
# base.longitude::DECIMAL(7,4) as base_long,
# COUNT(accounts.angaza_id) as count_sales
# FROM kazi_prod.fse_base_location as base
# LEFT JOIN easybuy.accounts_final_pdt AS accounts
# ON base.angaza_id = accounts.registering_user_angaza_id
# WHERE accounts.registration_date_utc::DATETIME > GETDATE() - INTERVAL '{sales_interval}'
# -- todo
# # todo: active agents filter
# AND base.country = '{country.upper()}'
# GROUP BY agent, base.country, base_lat, base_long
# HAVING count_sales >= '{sales_minimum_per_agent}'
# LIMIT 100000
# """
con = con
sql2 = f""" -- this query occasionally returns zero rows, proximal problem seems to be in db
WITH agent_status_pdt AS (SELECT country,
responsible_user_angaza_id,
MAX(registration_date_utc) as last_sale
FROM easybuy.accounts_final_pdt as accounts
WHERE country = '{country}'
GROUP BY 1,2)
SELECT base.angaza_id as agent,
base.country,
base.latitude ::DECIMAL(6,4) AS base_lat,
base.longitude::DECIMAL(7,4) AS base_long
FROM easybuy.users AS users_all
LEFT JOIN kazi_prod.fse_base_location as base
ON users_all.angaza_id = base.angaza_id
LEFT JOIN agent_status_pdt
ON users_all.angaza_id = agent_status_pdt.responsible_user_angaza_id
WHERE DATEDIFF('DAYS',last_sale, CURRENT_DATE) < 60 -- is_active_agent
AND agent IS NOT NULL
AND agent_status_pdt.country = '{country}'
--AND agent <> NaN
"""
df_base2 = pd.read_sql(sql2, con)
df_base2.set_index('agent', inplace=True)
print("\n\n\n# of agents: ", df_base2.shape[0], "\n\n\n")
df_base2.head(2)
# of agents: 10042
| country | base_lat | base_long | |
|---|---|---|---|
| agent | |||
| US000493 | KENYA | 0.1912 | 34.5292 |
| US000491 | KENYA | -21.5157 | 0.7899 |
# Create a raster with 5km x 5km cells
res = 5000 # Resolution in meters
cutoff_sales_num = 50
improvements
agreed:
dan's ideas:
check if all agents sell all products
subcounties and sub-sub counties: